Ensembles of Methods for Tweet Topic Classification

Author

  • Gretel Liz De la Peña Sarracén
Abstract

This paper describes the system we developed for the Classification Of Spanish Election Tweets (COSET) task at IberEval 2017. Our approach is based on a weighted average ensemble of five classifiers: 1) a logistic regression classifier; 2) a support vector machine classifier; 3) a multinomial Naive Bayes classifier; 4) a Gaussian Naive Bayes classifier; and 5) a k-nearest neighbors voting classifier. Each classifier was chosen for its contribution to the success of the system. The aim is to design an approach that uses a voting method to compensate for the weaknesses that individual classifiers can have. We compare the performance of the ensemble with that of the individual classifiers, and the experimental results show that the ensemble performs better.
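As a rough illustration of the weighted average voting described in the abstract, the sketch below combines class-probability vectors from five classifiers into a single prediction. It is not the author's implementation: the abstract does not state how the weights were chosen, so the weights, the probability values, and the topic labels here are all hypothetical, as are the helper names `weighted_average_ensemble` and `predict`.

```python
from typing import List, Sequence


def weighted_average_ensemble(
    probas: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Average per-classifier class-probability vectors using the given weights."""
    if len(probas) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probas[0])
    if any(len(p) != n_classes for p in probas):
        raise ValueError("all classifiers must score the same set of classes")
    # Weighted mean of each class score across the classifiers.
    return [
        sum(w * p[c] for p, w in zip(probas, weights)) / total
        for c in range(n_classes)
    ]


def predict(
    probas: Sequence[Sequence[float]],
    weights: Sequence[float],
    labels: Sequence[str],
) -> str:
    """Return the label with the highest weighted average score."""
    avg = weighted_average_ensemble(probas, weights)
    best = max(range(len(avg)), key=avg.__getitem__)
    return labels[best]


# Hypothetical outputs for one tweet from the five classifier types
# (values invented for illustration, not taken from the paper).
labels = ["topic_a", "topic_b", "topic_c"]
probas = [
    [0.6, 0.3, 0.1],  # logistic regression
    [0.5, 0.4, 0.1],  # support vector machine
    [0.2, 0.7, 0.1],  # multinomial Naive Bayes
    [0.3, 0.6, 0.1],  # Gaussian Naive Bayes
    [0.4, 0.4, 0.2],  # k-nearest neighbors
]
weights = [3, 3, 1, 1, 2]  # assumed weights, e.g. from validation scores

print(predict(probas, weights, labels))
print(predict(probas, [1, 1, 1, 1, 1], labels))
```

With these invented numbers, the weighted vote and the unweighted vote pick different labels, which shows why the choice of weights matters in such an ensemble.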

Similar resources

A High-Performance Model based on Ensembles for Twitter Sentiment Classification

Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks like Twitter intensively. It allows users to publish tweets to say what they think about topics. Numerous websites on the Internet present Twitter content. The user can enter a sentiment ta...

User interest prediction for tweets using semantic enrichment with DBpedia

This paper focuses on topic-based prediction of interest of individual users to posts in the context of Twitter. Two methods for enriching tweets using DBpedia for the purposes of classification are proposed. The first method incorporates entity linking and uses linked entities in a tweet to improve classification, whereas the second method aims to improve upon the first one by adding informati...

MCG-ICT at MediaEval 2015: Verifying Multimedia Use with a Two-Level Classification Model

The Verifying Multimedia Use task aims to detect misuse of online multimedia content and verify them as real or fake. This is a highly challenging problem because of strong variations among tweets from different events. Traditional approaches train the classifier at message level, which ignores inter-message relations. We propose a two-level classification model to exploit the information that ...

Learning Topical Translation Model for Microblog Hashtag Suggestion

Hashtags can be viewed as an indication of the context of a tweet or as the core idea expressed in it. They provide valuable information for many applications, such as information retrieval, opinion mining, and text classification. However, only a small number of microblogs are manually tagged. To address this problem, in this work, we propose a topical translation model for mic...

A New Document Embedding Method for News Classification

Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A large volume of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...


Publication date: 2017